Skunkware 98


SGML Document | 1993-11-24 | 77KB

<!doctype qwertz system [ <!entity LaTeX sdata "{\LaTeX}"> <!entity TeX sdata "{\TeX}" > <!entity et "&etago;"> <!entity bigcup "<mc>\bigcup</>"> <!entity l "["> <!entity r "]"> ]> <chapt>The <tt>qwertz</> Document Type Definition All of the <tt>qwertz</> document "styles", except bibliographies, are defined in a single SGML document type definition (DTD), called <tt>qwertz</>. It is essentially a SGML reconstruction of Lamport's &LaTeX <cite id="Lamport86">. We have not attempted to include every feature of &LaTeX in this DTD, but have included the features we use regularly. Others may of course find that something they deem important is missing. We welcome suggestions for improvements or extensions. We will be making use of several <em/parameter entities/ in this DTD: <code> <!entity % emph " em | it | bf | sf | sl | tt " > <!entity % xref " label | ref | pageref | cite | ncite " > <!entity % inline " (#pcdata | f | x | %emph; | sq | %xref)* " > <!entity % list " list | itemize | enum | descrip " > <!entity % par " %list; | comment | lq " > <!entity % mathpar " dm | eq " > <!entity % thrm " def | prop | lemma | coroll | proof | theorem " > <!entity % litprog " code | verb " > <!entity % sectpar " %par; | figure | tabular | table | %mathpar; | %thrm; | %litprog; "> </code> These are just macros used in the definitions of various elements, to avoid retyping and to ease maintenance. The <tt/emph/ parameter lists the various kinds of emphasis. The <tt/inline/ parameter is for the elements which may be used anywhere within the document. The <tt/list/ parameter is for various kinds of lists. <tt/par/ lists several basic kinds of elements at the level of paragraphs. The <tt>mathpar</> parameter includes the elements for <em/displayed/ mathematical formulas. The <tt>thrm</> parameter is for the set of elements used to represent such things as definitions, theorems and proofs. The <tt>litprog</> parameter is for literate programming elements. 
Finally, the <tt>sectpar</> parameter lists the elements which may occur at the level of paragraphs within sections (or chapters). Notice that this parameter uses other parameters.

Several kinds of documents may be written using &LaTeX: articles, reports, books, letters and slide (or transparency) presentations. The <tt/qwertz/ DTD includes two others as well: <tt/notes/, for documents such as notes to yourself which do not require a title, sections, footnotes and the like; and <tt/manpage/, for Unix manual pages.
<code>
<!element qwertz o o (sect | chapt | article | report | book | letter | telefax | slides | notes | manpage ) >
</code>
Notice that sections (<tt>sect</>) and chapters (<tt>chapt</>) may also be processed separately, before being put together into an article, report or book.

&LaTeX also includes Bib&TeX, a program for creating bibliographies whose entries can be easily cited in &LaTeX documents. The <tt/qwertz/ document type for this purpose is described in Chapter 5.

<sect>General Purpose Entities and Elements</>
This section describes the SGML entities and elements available in all <tt/qwertz/ documents.
<code>
<!entity % general system -- general purpose characters -- >
%general;
</code>

<sect1>Character Entities</>
Most characters are created just by typing the character wanted on the keyboard. This simple method does not suffice when the character wanted isn't in the character set available, or at least not associated with a key on the keyboard, or when the character currently has special meaning to SGML or, perhaps, &TeX;. In this section, a fairly large number of general purpose character entities will be presented. Symbols and characters which may be used only in mathematical formulas will be discussed separately, in section <ref id="math">.

When is it necessary to use an entity reference to produce some character?
There are three cases to watch out for: <descrip> <tag>SGML Concrete Syntax Delimiters.</> Although the SGML standard allows alternative concrete syntaxes to be defined, we use the so-called reference concrete syntax</> in the <tt>qwertz</> document types. In this reference syntax, < is the start tag open</> character, and <tt></</> is the end tag open</> delimiter. The other SGML delimiter authors should be aware of is &, the entity reference open</> delimiter of the reference syntax. The appropriate entity to use to generate these characters depends on the context. Normally, use <tt>lt</> to represent < and <tt>amp</> to get &, when these appear in strings which might otherwise be interpreted as starting tags or entity references. However, within the <tt>code</> or <tt>verb</> elements for literate programming, described in section <ref id="litprog">, use the <tt>ero</> entity to represent & and the <tt>etago</> entity for the sequence <tt>&etago</>. <verb> <!entity lt sdata "<" > <!entity amp sdata "&" > <!entity ero sdata "&ero;" > <!entity etago sdata "&etago;" > </verb> <tag>SGML Short Reference Delimiters.</> In SGML document types short reference maps</> may be defined which allow single characters to be interpreted as arbitrarily complex sequences of characters, including SGML tags and entity references. Thus, to know precisely when a certain character will be interpreted literally or as a short reference (i.e. macro) for something else, one has to know which map is in effect in the context of the current element. Just about all punctuation characters which are not used as delimiters in the concrete syntax can be used as short reference delimiters: <verb> " # % ' ( ) * + , - : ; = @ [ ] ^ _ { | } ~ </verb> For each of these characters, there is an SGML entity which may be used to generate the ASCII character in the printed document, listed in table <ref id="GPC">. 
Usually, it will not be necessary to use these entities; the character can simply be typed and will be interpreted literally.</> However, if the results are not as expected, check to see if there is a map in effect at that point in the document in which the character has been redefined. As maps are associated with elements, the section in this manual describing an element will also direct you to a description of the applicable map, if there is one. As it turns out, one important use of character maps is to generate exactly the character typed in the printed document. That is, the map is used to hide the special meaning of the character to the underlying formatter (e.g. &TeX;), replacing the character with the formatting instructions for generating the character. This has been the main use of maps in our <tt>qwertz</> document type definitions. <verb> <!entity dquot sdata "&dquot;" > <!entity num sdata "#" > <!entity percnt sdata "%" > <!entity quot sdata """ > <!entity lpar sdata "(" > <!entity rpar sdata ")" > <!entity ast sdata "*" > <!entity plus sdata "+" > <!entity comma sdata "," > <!entity hyphen sdata "‐" > <!entity colon sdata ":" > <!entity semi sdata ";" > <!entity equals sdata "=" > <!entity commat sdata "@" > <!entity lsqb sdata "[" > <!entity rsqb sdata "]" > <!entity circ sdata "ˆ" > <!entity lowbar sdata "_" > <!entity lcub sdata "{" > <!entity verbar sdata "|" > <!entity rcub sdata "}" > <!entity tilde sdata "˜" > </verb> <tag>&TeX Special Characters.</> Ideally, it should be possible to hide the conventions of the underlying formatting system completely. In fact, SGML parsers which implement the full ISO standard have a feature which makes this possible. However, the SGML parser we are using does not include this feature: the only characters which can serve as short references are the characters allowed for this purpose by the reference concrete syntax. 
Unfortunately, this reference syntax does not allow &, &dollar and &bsol, all of which are special &TeX characters, to be used as short references. Thus, the entities for these three characters (<tt>amp, dollar</> and <tt>bsol</>) must usually be used to produce them. (The &dollar and &bsol characters may be used directly within the <tt>verb</> and <tt>code</> elements, discussed below in section <ref id="litprog">. Also, within these elements use the <tt>ero</> entity to represent & in strings which might otherwise be interpreted as entity references.)
<verb>
<!entity bsol sdata "\" >
<!entity dollar sdata "$" >
</verb>
</descrip>

<sect1>Spacing, Dashes and Ellipsis</>
The meaning of the ordinary space character is context sensitive. Sometimes there is a space <em>within</> a single word. Such spaces can be typed using the <em>nonbreakable space</> (<tt>nbsp</>) entity, so that the word is not broken at that point at the end of a line. There are also contexts where one wants a certain amount of space to appear, without the formatter regarding it as space which may be shrunk in order to clean up the arrangement of words or characters on the line. There are three entities for this purpose: <tt>emsp</> denotes the amount of horizontal space required for the character "M". An <tt>ensp</> is just half as wide as an <tt>emsp</>, and a <em>thin space</> (<tt>thinsp</>) is <f>1/6</> of an <tt>emsp</>. Notice that these are relative amounts, depending on the font being used.
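As a brief illustration of the spacing entities just described, the following sketch shows how they might be typed. The sample text is invented for this example. Within this <tt>verb</> element the <tt>ero</> entity stands for the & character that one would type directly in running text.
<verb>
<!-- illustrative input only; the sample text is invented -->
The proof is on page&ero;nbsp;12 of Chapter&ero;nbsp;3.
Name:&ero;emsp;J.&ero;thinsp;Smith
</verb>
Here <tt>nbsp</> keeps "page 12" and "Chapter 3" from being split across lines, <tt>emsp</> inserts a fixed gap after the colon, and <tt>thinsp</> separates the initial from the surname by a small fixed amount.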
There are also three different kinds of dashes: <tt>hyphen</>, which was already mentioned above, is to be used for intra-word dashes, as in the word "intra-word".<footnote>However, the <tt>hyphen</> entity was not actually necessary here, as the - character was not being used in this context as a short reference.</footnote> <tt>ndash</> is to be used for number ranges, such as "23–56", and <tt>mdash</> is an alternative delimiter for parenthetical comments &mdash certainly you've seen them used this way &mdash perhaps to avoid too frequent use of commas or parentheses. <verb> <!entity nbsp sdata " " > <!entity emsp sdata " " > <!entity ensp sdata " " > <!entity thinsp sdata " " > <!entity mdash sdata "—" > <!entity ndash sdata "–" > <!entity hellip sdata "…" > </verb> <sect1>Foreign Languages</> There are a large set of entities for other Western European languages. Altogether, there are entities for almost all of the foreign language characters in ISO 8859, the Latin 1 character set for Western European languages.<footnote>Only the four Icelandic characters are missing.</> Conveniently, these entities are all available in the usual Adobe PostScript fonts, as well as in &TeX;. Thus, all of the entities defined here can be printed in &TeX;, on PostScript printers, or displayed on any Latin 1 device. Depending on the computer and editor, it may also be possible to type these Latin 1 characters directly, instead of having to use these entities. A simple filter could translate Latin 1 files into ASCII files, replacing non-ASCII characters by entity references. The entity names chosen here for these characters conform to the SGML standard. 
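For instance, a sentence containing accented characters might be typed with the entities of this section as follows. This is only a sketch: the sample sentence is invented, and within this <tt>verb</> element the <tt>ero</> entity stands for the & character that one would type directly in running text.
<verb>
<!-- illustrative input only; the sample sentence is invented -->
The caf&ero;eacute; on the Stra&ero;szlig;e serves cr&ero;egrave;me br&ero;ucirc;l&ero;eacute;e.
</verb>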
<verb> <!entity aacute sdata 'á' > <!entity Aacute sdata 'Á' > <!entity acirc sdata 'â' > <!entity Acirc sdata 'Â' > <!entity agrave sdata 'à' > <!entity Agrave sdata 'À' > <!entity aring sdata 'å' > <!entity atilde sdata 'ã' > <!entity Atilde sdata 'Ã' > <!entity auml sdata 'ä' > <!entity Auml sdata 'Ä' > <!entity aelig sdata 'æ' > <!entity AElig sdata 'Æ' > <!entity ccedil sdata 'ç' > <!entity Ccedil sdata 'Ç' > <!entity eacute sdata 'é' > <!entity Eacute sdata 'É' > <!entity ecirc sdata 'ê' > <!entity egrave sdata 'è' > <!entity Egrave sdata 'È' > <!entity euml sdata 'ë' > <!entity Euml sdata 'Ë' > <!entity iacute sdata 'í' > <!entity Iacute sdata 'Í' > <!entity icirc sdata 'î' > <!entity Icirc sdata 'Î' > <!entity igrave sdata 'ì' > <!entity Igrave sdata 'Ì' > <!entity iuml sdata 'ï' > <!entity Iuml sdata 'Ï' > <!entity ntilde sdata 'ñ' > <!entity Ntilde sdata 'Ñ' > <!entity oacute sdata 'ó' > <!entity Oacute sdata 'Ó' > <!entity ocirc sdata 'ô' > <!entity Ocirc sdata 'Ô' > <!entity ograve sdata 'ò' > <!entity Ograve sdata 'Ò' > <!entity oslash sdata 'ø' > <!entity Oslash sdata 'Ø' > <!entity otilde sdata 'õ' > <!entity ouml sdata 'ö' > <!entity Ouml sdata 'Ö' > <!entity szlig sdata 'ß' > <!entity uacute sdata 'ú' > <!entity Uacute sdata 'Ú' > <!entity ucirc sdata 'û' > <!entity ugrave sdata 'ù' > <!entity Ugrave sdata 'Ù' > <!entity uuml sdata 'ü' > <!entity Uuml sdata 'Ü' > <!entity yacute sdata 'ý' > <!entity Yacute sdata 'Ý' > <!entity yuml sdata 'ÿ' > </verb> The <tt>qwertz</> document types were developed in a German research center, so we have included entities for the German characters with shorter names than the entity names used in the SGML standard. Notice that these are just synonyms for the standard entities, which are also included. 
<code> <!entity Ae '&ero;Auml;' > <!entity ae '&ero;auml;' > <!entity Oe '&ero;Ouml;' > <!entity oe '&ero;ouml;' > <!entity Ue '&ero;Uuml;' > <!entity ue '&ero;uuml;' > <!entity sz '&ero;szlig;' > </code> <sect1>Other Symbols</> Finally, there are entities for a few miscellaneous symbols, such as §, ¶, ©, ¬, ÷, ±, ×, and μ. All of these entities name symbols in the Latin 1 character set. They may be used anywhere within a document. (In particular, the mathematical symbols shown here need not be within one of the formula elements described below, in section <ref id="math">.) The entity names for these, and all the other character entities discussed above, are listed in table <ref id="GPC">. <em/A document which does not include mathematical formulas or graphics and which uses only the character entities defined in this chapter can be displayed or printed using a single Latin 1 font/. <verb> <!entity gt sdata ">" > <!entity sect sdata "§"> <!entity para sdata "¶"> <!entity copy sdata "©"> <!entity iexcl sdata "¡" > <!entity iquest sdata "¿" > <!entity cent sdata "¢" > <!entity pound sdata "£" > <!entity not sdata "¬" > <!entity divide sdata "÷" > <!entity plusmn sdata "±" > <!entity times sdata "×" > <!entity mu sdata "μ" > </verb> <table> <tabular ca="ll|ll|ll|ll"> AElig | Æ | Aacute | Á | Acirc | Â | Ae | &Ae @ Agrave | À | Atilde | Ã | Auml | Ä | Ccedil | Ç @ Eacute | É | Egrave | È | Euml | Ë | Iacute | Í @ Icirc | Î | Igrave | Ì | Iuml | Ï | Ntilde | Ñ @ Oacute | Ó | Ocirc | Ô | Oe | &Oe | Ograve | Ò @ Oslash | Ø | Ouml | Ö | Uacute | Ú | Ue | &Ue @ Ugrave | Ù | Uuml | Ü | Yacute | Ý | aacute | á @ acirc | â | ae | &ae | aelig | æ | agrave | à @ amp | & | aring | å | ast | &ast | atilde | ã @ auml | ä | bsol | &bsol | ccedil | ç | cent | ¢ @ circ | &circ | colon | &colon | comma | &comma | commat | &commat @ copy | © | divide | ÷ | dollar | &dollar | dquot | &dquot @ eacute | é | ecirc | ê | egrave | è | emsp | @ ensp | | equals | &equals | euml | ë | gt | > @ 
hellip | &hellip | hyphen | &hyphen | iacute | í | icirc | î @
iexcl | ¡ | igrave | ì | iquest | ¿ | iuml | ï @
lcub | &lcub | lowbar | &lowbar | lpar | &lpar | lsqb | &lsqb @
lt | &lt | mdash | &mdash | mu | &mu | nbsp | &nbsp @
ndash | &ndash | not | ¬ | ntilde | ñ | num | &num @
oacute | ó | ocirc | ô | oe | &oe | ograve | ò @
oslash | ø | otilde | õ | ouml | ö | para | ¶ @
percnt | &percnt | plus | &plus | plusmn | ± | pound | £ @
quot | &quot | rcub | &rcub | rpar | &rpar | rsqb | &rsqb @
sect | § | semi | &semi | sz | &sz | szlig | ß @
thinsp | &thinsp | tilde | &tilde | times | × | uacute | ú @
ucirc | û | ue | &ue | ugrave | ù | uuml | ü @
verbar | &verbar | yacute | ý | yuml | ÿ |
</tabular>
<caption><label id="GPC">General Purpose Characters</caption>
</table>

<sect1>Sentences, Paragraphs, Emphasis and Quotations</>
Sentences need not be marked up with tags. There is no <tt>sentence</> element as such. Rather, these are marked implicitly using the usual conventions for beginning and ending sentences.

Paragraphs are delimited with the <tt/p/ tag. Both the starting tag and ending tag are optional.
<code>
<!element p o o ( %inline | %sectpar )+ >
<!entity ptag '' >
<!entity psplit '&etago;p>' >
<!shortref pmap "&ero;#RS;B" null "&ero;#RS;B&ero;#RE;" psplit "&ero;#RS;&ero;#RE;" psplit '"' qtag "[" ftag "~" nbsp "_" lowbar "#" num "%" percnt "^" circ "{" lcub "}" rcub "|" verbar >
<!usemap pmap p>
</code>
Sentences or phrases within paragraphs can be emphasized in a number of ways. The <tt>em</> tag is used to choose the default form of emphasis, which is usually <em>italic</> type, but depends on the style of the background text. If the background text is itself set in italic type, as it usually is in definitions, for example, then emphasized text will be formatted using a plain, roman typeface. However, various forms of emphasis can be explicitly chosen.
These include: <bf>bold face</> (<tt>bf</>), <it>italics</> (<tt>it</>), <sf>sans serif</> (<tt>sf</>), <sl>slanted</> (<tt>sl</>), and <tt>typewriter</> (<tt>tt</>) styles. <code> <!element em - - (%inline)> <!element bf - - (%inline)> <!element it - - (%inline)> <!element sf - - (%inline)> <!element sl - - (%inline)> <!element tt - - (%inline)> </code> The <tt>tt</> element simulates a "typewriter". That is, with a couple of exceptions, characters are printed exactly as they appear on the display. This is useful for including small segments of computer code within paragraphs. See the section on literate programming for more information, <ref id="litprog">. Sentences within paragraphs can be quoted using the short quote</>, (<tt>sq</>) tag, as in <tt><sq>The rain in Spain falls mainly on the plain.</></tt>, but this is usually not necessary. In most contexts where one will want to use quotations, there is a map allowing the &dquot symbol to be used as a short reference for both the starting and ending <tt>sq</> tags. So one can just type: <tt>"The rain in Spain falls mainly on the plain."</> Quotations extending over a number of paragraphs are marked using the long quote</> (<tt>lq</>) element. Long quotes are formatted in &LaTeX by indenting the left and right margins. For example, <ncite id="Lamport86" note="pp. xiii">: <lq> The &LaTeX document preparation system is a special version of Donald Knuth's &TeX program. &TeX is a sophisticated program designed to produce high-quality typesetting, especially for mathematical text. &hellip &LaTeX represents a balance between functionality and ease of use. Since I implemented most of it myself, there was also a continual compromise between what I wanted to do and what I could do in a reasonable amount of time. 
&hellip </lq>
<code>
<!element sq - - (%inline)>
<!entity ftag '<f>' -- formula begin -- >
<!entity qendtag '&et;sq>'>
<!shortref sqmap "&ero;#RS;B" null '"' qendtag "[" ftag "~" nbsp "_" lowbar "#" num "%" percnt "^" circ "{" lcub "}" rcub "|" verbar >
<!usemap sqmap sq >
<!element lq - - (p*)>
</code>

<sect1>Lists</>
Four types of lists are supported, which differ according to the type of label used to mark each item in the list. Use <tt>itemize</> to create a list in which each item is marked with some symbol such as a dash or bullet. The <tt>enum</> tag is used to create an enumeration, i.e. a list in which each item is labelled with a number (or letter) indicating its rank or position in the list. The <tt/list/ type of list does not label the items at all. Finally, use <tt>descrip</> to create a list in which each item is labelled by some tag of your own choice. Lists of various types can be nested. For example:
<verb>
<itemize>
<item> A level one item.
<item> Here's level two:
  <enum>
  <item> A level two item.
  <item> Here's level three:
    <enum>
    <item> A level three item.
    <item>Here's level four:
      <descrip>
      <tag/Red./ Is the color of my true love's hair.
      <tag/Blue./ Is a property of some movies.
      <tag/Yellow./ Characterizes some forms of journalism.
      &et;descrip>
    <item>A last level three item
    &et;enum>
  <item>A last level two item
  &et;enum>
<item>A last level one item.
&et;itemize>
</verb>
This is formatted by &LaTeX; as:
<itemize>
<item> A level one item.
<item> Here's level two:
  <enum>
  <item> A level two item.
  <item> Here's level three:
    <enum>
    <item> A level three item.
    <item>Here's level four:
      <descrip>
      <tag/Red./ Is the color of my true love's hair.
      <tag/Blue./ Is a property of some movies.
      <tag/Yellow./ Characterizes some forms of journalism.
      </descrip>
    <item>A last level three item
    </enum>
  <item>A last level two item
  </enum>
<item>A last level one item.
</itemize> <code> <!element itemize - - (item+)> <!element list - - (item+)> <!element enum - - (item+)> <!element descrip - - ((tag?, (%inline; | %sectpar;)*, p*)+) > <!element item o o ((%inline; | %sectpar;)*, p*) > <!element tag - o (%inline)> <!usemap global (list,itemize,enum,descrip)> </code> For reasons having to do with our translation into &LaTeX, line feeds within <tt>tag</> elements are translated into spaces, using the <tt>oneline</> short reference map:  <code> <!entity space " "> <!entity null ""> <!shortref oneline "&ero;#RS;&ero;#RE;" null "&ero;#RS;B&ero;#RE;" null '"' qtag "[" ftag "~" nbsp "_" lowbar "#" num "%" percnt "^" circ "{" lcub "}" rcub "|" verbar> <!usemap oneline tag> </code> <sect1>Figures and Tables</> Figures and tables are floating elements; they may appear at a different location in the printed version of the document than in the input file. There is a location (<tt/loc/) attribute, which can be used to influence the location chosen by the formatter. The value of the <tt/loc/ attribute is a string of up to four letters, where each letter declares a location at which the figure or table may appear, as follows: <descrip> <tag/<tt/h/./ At the same relative location as it appears in the SGML input file (i.e. <em/here/). <tag/<tt/t/./ At the <em/top/ of a page. <tag/<tt/b/./ At the <em/bottom/ of a page. <tag/<tt/p/./ On a separate <em/page/ containing only figures and tables. </descrip> The default value of the <tt/loc/ attribute is <tt/tbp/. A <tt>figure</> is a graphic combined with an optional caption. Two types of figures are currently supported. The first, and easiest, is to use the <tt>eps</> tag to include an Encapsulated PostScript file in the document. Encapsulated PostScript files are centered horizontally on the page. The size of the graphic is its "natural" size; i.e. the size it would have if printed directly on a PostScript printer. You need only know the name of the file containing the graphic. 
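The following sketch shows how the <tt/loc/ attribute might be combined with the <tt>eps</> tag. The file name <tt/diagram/ and the caption text are hypothetical, chosen only for this illustration.
<verb>
<!-- hypothetical file name and caption, for illustration only -->
<figure loc="ht">
<eps file="diagram">
<caption>A diagram allowed here or at the top of a page &et;caption>
&et;figure>
</verb>
With <tt/loc="ht"/ the formatter is told that the figure may appear either at the same relative location as in the input file or at the top of a page; the final choice is still made by the formatter.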
Encapsulated PostScript graphics can be created using a variety of different editors. If you are using Unix with an X11-based graphical user interface, you may want to try <tt>idraw</>, which stores its documents directly as Encapsulated PostScript files. Other interesting X11-based drawing programs are <tt/xfig/ and <tt/tgif/.

For example, to include the graphic contained in an Encapsulated PostScript file named <tt>issues.ps</>, you would type:
<verb>
<figure>
<eps file="issues">
<caption>An <tt>idraw&et;> Drawing &et;>
&et;figure>
</verb>
This would then appear as in figure <ref id="issues">.
<figure>
<eps file="issues">
<caption><label id="issues">An <tt>idraw</> Drawing </>
</figure>
Notice that the ".ps" extension is <em>not</> to be included in the file attribute of the <tt>eps</> element, but that the actual file must include the ".ps" extension.

The second possibility is to use the <em/placeholder/ (<tt>ph</>) tag to leave space in which to later paste the graphic, in the old, reliable manner. For example, to leave 10 cm of space for some graphic, type:
<verb>
<figure>
<ph vspace="10cm">
&et;figure>
</verb>
Be sure not to leave a space between the number and the unit of measurement used, which may be <tt>cm</>, <tt>mm</> or <tt>in</>.
<code>
<!element figure - - ((eps | ph ), caption?)>
<!attlist figure loc cdata "tbp">
<!element eps - o empty >
<!attlist eps file cdata #required>
<!element ph - o empty >
<!attlist ph vspace cdata #required>
<!element caption - o (%inline)>
<!usemap oneline caption>
</code>
Next, there is a <tt>tabular</> element. Using &LaTeX;, tabulars must be small enough to fit on a single page. The current <tt>tabular</> element has been kept quite simple. It certainly does not (yet) offer all the flexibility of &LaTeX;. However, it may well be that it is sufficient for most users.
More complex tables can, depending on your choice of formatters, be created using &LaTeX or Unix's <tt/tbl/ program, with the <tt>x</> element, or with any program capable of generating Encapsulated PostScript, which can then be included using an <tt>eps</> element. A <tt>tabular</> consists of a number of rows, separated by the <tt>rowsep</> element, each of which consists of a number of columns separated by the <tt>colsep</> element. The format of the tabular is controlled by the column alignment</> (<tt>ca</>) attribute. For each column in the tabular there is a letter in the <tt>ca</> attribute: 1) <tt>c</> for centered; 2) <tt>l</> for flush left; or 3) <tt>r</> for flush right. In addition, &verbar can be used to insert vertical lines running the complete height of the table. This will be made clear in the example which is coming shortly. First, however, let me describe the short reference map defined for tabulars. Rather than typing <tt><colsep></> and <tt><rowsep></> explicitly, one can just type &verbar to separate columns, and &commat to separate rows. Also, within tabulars, &lsqb can be used to start a mathematical formula, and &dquot starts short quotes as usual. (The other short references just hide any special meaning the character may have to &TeX;.) <code> <!entity % tabrow "(%inline, (colsep, %inline)*)" > <!element tabular - - (%tabrow, (rowsep, hline?, %tabrow)*, caption?) 
>
<!attlist tabular ca cdata #required>
<!element rowsep - o empty>
<!element colsep - o empty>
<!element hline - o empty>
<!entity rowsep "<rowsep>">
<!entity colsep "<colsep>">
<!shortref tabmap "&ero;#RE;" null "&ero;#RS;&ero;#RE;" null "&ero;#RS;B&ero;#RE;" null "&ero;#RS;B" null "B&ero;#RE;" null "BB" null "&ero;#SPACE;" null "&ero;#TAB;" null "@" rowsep "|" colsep "[" ftag '"' qtag "_" thinsp "~" nbsp "#" num "%" percnt "^" circ "{" lcub "}" rcub >
<!usemap tabmap tabular>
</code>
The <tt>hline</> element can be used to draw a horizontal line along the length of the table, to separate rows.

A <tt/table/ element consists of a <tt>tabular</> followed by an optional <tt>caption</>. Unlike a tabular, a <tt/table/ is a floating "body", like a figure. It may be moved to another (nearby) location within the formatted document. A <tt/tabular/, however, appears at the same place in the formatted document as in the SGML source file.
<code>
<!element table - - (tabular, caption?) >
<!attlist table loc cdata "tbp">
</code>
Here is how table <ref id="GPC"> was typed:
<verb>
<table>
<tabular ca="ll|ll">
ae | &ero;ae | Ae | &ero;Ae @
oe | &ero;oe | Oe | &ero;Oe @
ue | &ero;ue | Ue | &ero;Ue @
sz | &ero;sz | amp | &ero;amp @
bsol | &ero;bsol | circ | &ero;circ @
. . .
Dagger | &ero;Dagger | sect | &ero;sect @
para | &ero;para | copy | &ero;copy @
mdash | &ero;mdash | tilde | &ero;tilde
&et;tabular>
<caption><label id="GPC"> General Purpose Characters &et;caption>
&et;table>
</verb>

<sect1><heading><label id="litprog">Literate Programming</>
The original motivation behind the development of these document types was to create an environment for literate programming in an arbitrary programming language similar to Donald Knuth's WEB system for literate programming in Pascal <cite id="Knuth84">. The basic idea is to include the source code of a program inside its documentation, instead of the other way around: including comments within the source code.
The features offered here to support literate programming, or merely the documentation of existing programs, have been kept to a minimum. Snippets of code can be mentioned within sentences using the <tt>tt</> tag. These are formatted using a <tt>typewriter</> font suitable for program code, but the spacing and indentation of the code is not retained. Within <tt/tt/ elements, the only characters which may not be literally interpreted are &dollar, &bsol, &, and <tt></</>. For the &dollar and &bsol symbols, always use the <tt>dollar</> and <tt>bsol</> entities. For the & and < symbols, use the <tt>amp</> and <tt>lt</> entities if the string in which they occur could be mistaken for an entity reference, an element start tag or an element end tag. To include larger segments of code, retaining its line breaks, tabulation and spacing, use the <tt>code</> tag or the <tt>verb</> tag. Within these tags just about all characters are interpreted literally. The exceptions are: <enum> <item>As SGML entities may be used within <tt>verb</> and <tt>code</> elements, use the <tt>ero</> entity to represent the & symbol in strings which might otherwise be mistaken for entity references. (Notice that the <tt>amp</> entity is not used to represent & in this context.) <item> As there must be some way of ending such elements, use the <tt>etago</> entity to represent <tt>&et</> in strings which might otherwise be interpreted as end tags. (Do not use the <tt>lt</> entity for this purpose here.) Start tags can be typed literally in this context, without using entities. <item>Unfortunately &TeX peeks through a bit here as well; The string <tt>\end{verbatim}</> may not occur within <tt>code</> or <tt>verb</> elements. Presumably this will not often be a problem. 
</enum>
For example, to include the "hello world" C program in a document, just type:
<verb>
<code>
main ()
{
  /* This is the famous hello world program */
  printf("hello world\n");
}
&et;code>
</verb>
When formatted, spaces and line breaks are preserved:
<verb>
main ()
{
  /* This is the famous hello world program */
  printf("hello world\n");
}
</verb>
Notice that no entities were required in this code. With few exceptions, it should be possible to just wrap <tt>verb</> or <tt>code</> tags around existing pieces of code without change.

The idea of literate programming is that the documentation <em>is</> the program, so there must be some way of extracting the source code from the SGML document. Just how to do this is described in chapter <ref id="UC">, below. The user must have a means of indicating which pieces of code are to be included in the source code, and in which order. Our solution to this problem is very simple: <em>Only <tt>code</> elements are to be extracted, and they are extracted in the same order as they appear in the document.</> That is, <tt>verb</> elements are <em>not</> extracted, and may be used, e.g., for examples or draft versions of the code included for explanatory or tutorial purposes.

<tt>code</> and <tt>verb</> elements may be formatted differently. Using our translation into &LaTeX, for example, <tt>code</> elements are distinguished by being bracketed by lines the width of the page.
<code>
<!element code - - rcdata>
<!element verb - - rcdata>
<!shortref ttmap "&ero;#RS;B" null '#' num '%' percnt '~' tilde '_' lowbar '^' circ '{' lcub '}' rcub '|' verbar >
<!usemap ttmap tt>
</code>

<sect1><heading><label id="math">Mathematical Formulas</>
The <tt>qwertz</> document types include elements for describing mathematical formulas completely within SGML, similar to the system described in <cite id="daphne89">. To start, there are a fairly large number of entities for mathematical symbols.
(The set of entities chosen are for the symbols available in both &TeX and in the PostScript Symbol font.) Although this may be a minor irritation for seasoned &TeX users, we have decided to follow the naming conventions for mathematical symbols adopted in the SGML Standard <cite id="Smith88">. The complete set of mathematical symbols currently defined, including the Greek alphabet are listed in tables <ref id="mathsym"> and <ref id="greek">, in alphabetical order. <code> <!entity % math system -- math symbols -- > %math; </code> <table> <tabular ca="ll|ll|ll|ll"> Prime | [&Prime] | aleph | [&aleph] | and | [&and] | ang | [&ang] @ ap | [&ap] | arr | [&darr] | bottom | [&bottom] | bull | [&bull] @ cap | [&cap] | cir | [&cir] | clubs | [&clubs] | congr | [&congr] @ cup | [&cup] | diams | [&diams] | divide | [÷] | dot | [&dot] @ empty | [&empty] | equiv | [&equiv] | exist | [&exist] | forall | [&forall] @ ge | [&ge] | hArr | [&hArr] | harr | [&harr] | hearts | [&hearts] @ image | [&image] | infin | [&infin] | isin | [&isin] | lArr | [&lArr] @ lang | [&lang] | larr | [&larr] | le | [&le] | mid | [&mid] @ minus | [&minus] | nabla | [&nabla] | ne | [&ne] | nequiv | [&nequiv] @ not | [¬] | notin | [¬in] | nsub | [&nsub] | nsube | [&nsube] @ nsup | [&nsup] | nsupe | [&nsupe] | nvDash | [&nvDash] | nvdash | [&nvdash] @ oplus | [&oplus] | or | [&or] | otimes | [&otimes] | part | [&part] @ plusmn | [±] | prime | [&prime] | prop | [&prop] | rArr | [&rArr] @ rang | [&rang] | rarr | [&rarr] | real | [&real] | setmn | [&setmn] @ spades | [&spades] | square | [&square] | sub | [&sub] | sube | [&sube] @ sup | [&sup] | supe | [&supe] | times | [×] | uArr | [&uArr] @ uarr | [&uarr] | vDash | [&vDash] | vdash | [&vdash] @ </tabular> <caption><label id="mathsym">Math Symbols</> </table> <table> <tabular ca="ll|ll|ll"> alpha | [&alpha] | beta | [&beta] | gamma | [&gamma] @ Gamma | [&Gamma] | delta | [&delta] | Delta | [&Delta] @ epsi | [&epsi] | zeta | [&zeta] | eta | [&eta] @ thetas | 
[&thetas] | Theta | [&Theta] | iota | [&iota] @ kappa | [&kappa] | lambda | [&lambda] | mu | [&mu] @ nu | [&nu] | xi | [&xi] | Xi | [&Xi] @ pi | [&pi] | Pi | [&Pi] | rho | [&rho] @ sigma | [&sigma] | sigmav | [&sigmav] | Sigma | [&Sigma] @ tau | [&tau] | upsi | [&upsi] | Upsi | [&Upsi] @ phis | [&phis] | Phi | [&Phi] | chi | [&chi] @ psi | [&psi] | Psi | [&Psi] | omega | [&omega] @ Omega | [&Omega] </tabular> <caption><label id="greek">Greek Letters</> </table> &TeX symbols not in table 2 may nonetheless be generated, by defining an entity using the <tt>mc</> element. For example, to print the <x>$\leadsto$</x> symbol, you could first define an entity, perhaps using the name adopted for this symbol in the SGML standard: <verb> <!entity rarrw "<mc/<x/\leadsto//"> </verb> Of course, this approach is &TeX dependent. But this dependency is clearly noted at the beginning of the document, and it would be an easy matter to replace the &TeX command for such entities with the appropriate commands for some other formatter. The <tt>mc</> tag used in this entity definition is for math characters</>. The entity could have been defined using only the <tt>x</> tag described in section <ref id="misc">, but it is "safer" to use the <tt>mc</> tag when defining entities which are only to be used within formulas, as the SGML parser will complain if they are used elsewhere. If <tt>x</> were used instead, such errors would first be caught by &TeX;. <code> <!element mc - - cdata > </code> There are a number of parameters for formulas. These will most likely be of little interest to most users, but are stated here for the sake of completeness. 
<code> <!entity % sppos "tu" > <!entity % fcs "%sppos;|phr" > <!entity % fcstxt "#pcdata|mc|%fcs;" > <!entity % fscs "rf|v|fi" > <!entity % limits "pr|in|sum" > <!entity % fbu "fr|lim|ar|root" > <!entity % fph "unl|ovl|sup|inf" > <!entity % fbutxt "(%fbu;) | (%limits;) | (%fcstxt;)|(%fscs;)|(%fph;)" > <!entity % fphtxt "p|#pcdata" > </code> There are three elements for representing formulas: <tt>f</>, for ordinary short formulas appearing "in-line"; <tt>dm</> for displayed formulas</> to be centered on a line (or lines) by themselves; and <tt>eq</> for displayed formulas which are to be numbered sequentially throughout the document (i.e. so-called "equations"). <code> <!element f - - ((%fbutxt;)*) -(footnote) > <!entity fendtag '&et;f>' -- formula end -- > <!shortref fmap "&ero;#RS;B" null "&ero;#RS;B&ero;#RE;" null "&ero;#RS;&ero;#RE;" null "_" thinsp "~" nbsp "]" fendtag "#" num "%" percnt "^" circ "{" lcub "}" rcub "|" verbar> <!usemap fmap f > <!element dm - - ((%fbutxt;)*) -(footnote)> <!element eq - - ((%fbutxt;)*) -(footnote)> <!shortref dmmap "&ero;#RE;" space "_" thinsp "~" nbsp "]" fendtag "#" num "%" percnt "^" circ "{" lcub "}" rcub "|" verbar> <!usemap dmmap (dm,eq)> </code> Usually it is not necessary to type the starting and ending tags of the <tt>f</> element explicitly: &lsqb and &rsqb are short reference delimiters, allowing one to simply type, for example, <tt>[&alpha &rarr &beta]</>, instead of <tt><f>&alpha &rarr &beta</f></tt> to represent [&alpha &rarr &beta].<footnote>&TeX users will appreciate that this notation is no more verbose than &TeX;.</footnote> The only characters of interest in <tt>fmap</> are &lowbar &tilde and ]. &lowbar is a short reference for <tt>thinsp</>, which adds a little extra horizontal space. &tilde means <tt>nbsp</>, which in turn denotes a non-breaking space. &TeX will not start a new line at a <tt>nbsp</>. Finally, ] is used to end the formula. 
The other characters in this map just protect us from any special meaning &TeX gives them. The <tt>dmmap</> is much the same as the <tt>fmap</>. There are just two differences: 1) ] is not a short reference for the <tt>f</> closing tag (and instead has its literal meaning), and 2) carriage returns and new lines are replaced by spaces, for reasons having to do with the way &TeX formats formulas. Use the <tt>tu</> element, defined a bit later, to force line breaks in formulas. Of course, formulas consist of more than just a string of math symbols. There are elements for representing fractions (<tt>fr</>), products (<tt>pr</>), integrals (<tt>in</>), sums (<tt>sum</>), roots (<tt>root</>) and arrays (<tt>ar</>). Each of these will be described next. A fraction consists of a numerator (<tt>nu</>) and a denominator (<tt>de</>). For example, [12/37] can be written as: <verb> [<fr><nu>12<de>37&et;fr>] </verb> Of course, this is rather lengthy. For simple fractions such as this, you may prefer to just type <tt>[12/37]</>, which is formatted by &LaTeX in the same way.<footnote>On the other hand, if you are a SGML purist, you may prefer not to do this, as it makes assumptions about the formatting system being used.</footnote> <code> <!element fr - - (nu,de) > <!element nu o o ((%fbutxt;)*) > <!element de o o ((%fbutxt;)*) > </code> Products, integrals and sums all have similar structure, consisting of a lower limit</> (<tt>ll</>), an upper limit</> (<tt>ul</>) and an optional operand</> (<tt>opd</>). <code> <!element ll o o ((%fbutxt;)*) > <!element ul o o ((%fbutxt;)*) > <!element opd - o ((%fbutxt;)*) > <!element pr - - (ll,ul,opd?) > <!element in - - (ll,ul,opd?) > <!element sum - - (ll,ul,opd?) 
> </code> So, for example, <dm> <sum><ll>i=1<ul>n<opd>x<inf>i</></sum> = <in><ll>0<ul>1<opd>f</in> </dm> was typed as: <verb> <dm> <sum><ll>i=1<ul>n<opd>x<inf>i&et>&et;sum> = <in><ll>0<ul>1<opd>f&et;in> &et;dm> </verb> This example also shows how to represent subscripts, using the <tt>inf</> tag. There is also a <tt>sup</> tag for superscripts. For operators with upper and lower limits other than products, sums or integrals, use the <tt>lim</> element. <code> <!element lim - - (op,ll,ul,opd?) > <!element op o o (%fcstxt;|rf|%fph;) -(tu) > </code> For example, <dm> <lim><op>&bigcup<ll>i=0<ul>n<opd>{&alpha<inf>i</> &rarr &beta}</lim> </dm> was typed as <verb> <!entity bigcup "<mc>\bigcup&et;>"> ... <dm> <lim>&ero;bigcup<ll>i=0<ul>n&et;> <opd>{&ero;alpha<inf>i&et;> &ero;rarr &ero;beta}&et> &et;lim> &et;dm> </verb> Notice that it isn't necessary to type the <tt>op</> tag here. Roots can be represented using the, what else, <tt>root</> element. By default, <tt>root</> produces square roots. The <tt>n</> attribute of <tt>root</> can be used for other roots. For example, type <tt>[<root n=3/x+y/]</tt> to get [<root n=3/x+y/]. <code> <!element root - - ((%fbutxt;)*) > <!attlist root n cdata ""> </code> Arrays, or matrices, consist of a sequence of rows, each of which contains a sequence of columns. Every row in the array must contain the same number of columns. Rows are separated</> by the <tt>arr</> tag; columns by the <tt>arc</> tag. The array itself is delimited by the <tt>ar</> tag. 
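Written out with explicit tags, a small array could be typed roughly as in the following sketch. The entries are invented for illustration only, and the <tt>ca</> attribute, which the declarations require, here asks for two centered columns:
<verb>
<ar ca=cc>
a <arc> b <arr>
c <arc> d
&et;ar>
</verb>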
<code> <!element col o o ((%fbutxt;)*) > <!element row o o (col, (arc, col)*) > <!element ar - - (row, (arr, row)*) > <!attlist ar ca cdata #required > <!element arr - o empty > <!element arc - o empty > </code> This is a place where an SGML short reference map has proven useful: <code> <!entity arr "<arr>" > <!entity arc "<arc>" > <!shortref arrmap "&ero;#RE;" space "@" arr "|" arc "_" thinsp "~" nbsp "#" num "%" percnt "^" circ "{" lcub "}" rcub > <!usemap arrmap ar > </code> Columns can be separated using the &verbar character; rows with the &commat character. For example, this matrix <dm> <ar ca=clcr> a+b+c | uv | x-y | 27 @ a+b | u+v | z | 134 @ a | 3u+vw | xyz | 2,978 </ar> </dm> was typed as: <verb> <ar ca=clcr> a+b+c | uv | x-y | 27 @ a+b | u+v | z | 134 @ a | 3u+vw | xyz | 2,978 &et;ar> </verb> The column alignment</> of an array must be specified using the <tt>ca</> attribute, as shown in the example. For each column in the array, there is a letter in the <tt>ca</> attribute. There are three alternatives: 1) <tt>c</> for centered; 2) <tt>l</> for flush left; and 3) <tt>r</> for flush right. There remain a few miscellaneous math elements to describe. <tt>sup</> and <tt>inf</>, for superscripts and subscripts, were mentioned above. <tt>unl</> and <tt>ovl</> can be used to underline</> or overline</> formulas. <tt>rf</> is used for identifiers, such as function names (e.g. <tt>cos</> or <tt>sin</>) within formulas. Similarly, <tt>phr</> is used to delimit phrases of ordinary text within formulas. (Both of these are necessary, as strings of characters within formulas denote sequences of variables, not words.) The <tt>v</> tag can be used to denote a vector</>, as in [<v>x</>]. Calligraphic characters, such as [<fi>L</>], can be denoted using the <tt>fi</> tag. Finally, line breaks can be inserted into formulas using the <tt>tu</> element. 
<code> <!element sup - - ((%fbutxt;)*) -(tu) > <!element inf - - ((%fbutxt;)*) -(tu) > <!element unl - - ((%fbutxt;)*) > <!element ovl - - ((%fbutxt;)*) > <!element rf - o (#pcdata) > <!element phr - o ((%fphtxt;)*) > <!element v - o ((%fcstxt;)*) -(tu|%limits;|%fbu;|%fph;) > <!element fi - o (#pcdata) > <!element tu - o empty > <!usemap global (rf,phr)> </code> <sect1>Definitions, Lemmas and Theorems</> There are a number of elements useful for representing definitions</> (<tt>def</>), propositions</> (<tt>prop</>), lemmas</> (<tt>lemma</>), corollaries</> (<tt>coroll</>), proofs</> (<tt>proof</>), and theorems</> (<tt>theorem</>). <code> <!element def - - (thtag?, p+) > <!element prop - - (thtag?, p+) > <!element lemma - - (thtag?, p+) > <!element coroll - - (thtag?, p+) > <!element proof - - (p+) > <!element theorem - - (thtag?, p+) > <!element thtag - - (%inline)> <!usemap global (def,prop,lemma,coroll,proof,theorem)> <!usemap oneline thtag> </code> With the exception of <tt>proof</>, these all have the same structure: an optional <tt>thtag</> followed by some paragraph level elements. Here is an example: <theorem><thtag>Alexander's Theorem</> Let [<fi/G/] be a set of nontrivially achievable subgoals and &lt an order on [<fi/G/]. &lt is abstractly indicative if and only if it is a linearization of [<lim>&lt <ll> <fi/G/ <ul>&ast </lim>]. </theorem> This was typed as: <verb> <theorem><thtag>Alexander's Theorem&et> Let [<fi/G/] be a set of nontrivially achievable subgoals and &ero;lt an order on [<fi/G/]. &ero;lt is abstractly indicative if and only if it is a linearization of [<lim>&ero;lt <ll> <fi/G/ <ul> &ero;ast &et;lim>]. &et;theorem> </verb> <sect1> The <tt>global</> Short Reference Map The <tt>global</> short reference map, which is the default map in effect within <tt>qwertz</> documents, allows the &dquot symbol to be used to start a short quote</> (<tt>sq</>) and &lsqb to start a formula</> (<tt>f</>). Also, &tilde is used for non-breaking spaces. 
The rest of the short references just serve to hide any special meaning &TeX gives these characters, allowing them to be directly typed without having to use entity references. <code> <!entity qtag '<sq>' > <!shortref global "&ero;#RS;B" null -- delete leading blanks -- '"' qtag "[" ftag "~" nbsp "_" lowbar "#" num "%" percnt "^" circ "{" lcub "}" rcub "|" verbar> <!usemap global qwertz> </code> <sect>Cross References</> Places within a document can be marked using the <tt>label</> element. Labels have an <tt>id</> attribute for naming the label. The SGML parser will check that these identifiers are unique within the document, and that they are referenced. That is, the parser will complain if there is no reference to a label. For this reason, labels should probably be created on demand, rather than in anticipation of the need for a reference to the element. There are two kinds of references: <tt>ref</> for references to the number of some element, such as a section, figure or theorem, and <tt>pageref</>, for references to the number of the page on which the text around the label occurs when the document is printed. Both types of references have an <tt>id</> attribute for stating the identifier of the label being referenced. The number of the element or page will be printed at the place of the <tt>ref</> or <tt>pageref</>. <code> <!element label - o empty> <!attlist label id cdata #required> <!element ref - o empty> <!attlist ref id cdata #required> <!element pageref - o empty> <!attlist pageref id cdata #required> </code> For example, a reference to the section on miscellaneous elements of this manual, section <ref id=misc>, would be typed as: <verb> ... section <ref id=misc>, would be ... 
</verb> The label itself was typed as: <verb> <sect><heading><label id="misc"> Miscellaneous Elements&et> </verb> <sect><heading><label id="misc">Miscellaneous Elements</> There are just a couple of general-purpose elements remaining to be discussed, which don't seem to have found a suitable home yet elsewhere in this manual. Editorial comments and reminders to oneself can be marked with the <tt>comment</> tag. These comments will be printed using a different type style than the body of the text. In the <tt>qwertz</> mapping into &TeX, they are printed using the <sl>slanted type style</>. If you do not want the comment to be printed, use the standard SGML notation for comments instead: <tt>&lt;!-- ... --></tt>. Finally, there is an "escape" element, allowing you to include raw formatting code at any place in your document, the <tt>x</> element. This code will be passed on to the formatter, such as &TeX;, inline, at the point it appears in your document. Of course, this "feature" should be used judiciously, as it limits the formatter independence of the document. <code> <!element comment - - (%inline)> <!element x - - ((#pcdata | mc)*) > <!usemap #empty x > </code> Notice that math character (<tt>mc</>) elements may appear within <tt>x</> elements. This allows you to use SGML entity references for math characters, to help avoid having to remember both the SGML and the formatter's names for these symbols. Other entities may also be used, so long as they expand to character data. <sect>Articles, Reports and Books</> Articles, reports and books are structurally very similar. They may be formatted differently, of course, but this is of little importance during the writing phase of primary interest to authors. Seen abstractly, each type of document consists of a title page</>, for such information as the title of the document, the names of the authors and so on, followed perhaps by an abstract</>, and then by a sequence of chapters</> or sections</>. 
There may be citations</>, which are references to documents listed at the end, in a bibliography</>. Perhaps there are one or more appendices</>. Finally, these documents may also contain footnotes</>. Let us first precisely describe the overall structure of these document types, before moving on to describe their various components. The article element is defined as: <code> <!element article - - (titlepag, header?, abstract?, toc?, lof?, lot?, p*, sect*, (appendix, sect+)?, biblio?) +(footnote)> <!attlist article opts cdata "null"> </code> The options</> attribute (<tt>opts</>) of <tt>article</> provides a place to state formatting</> options, which are passed on to &LaTeX;. The particular options available depend on the installation of &LaTeX being used, but the following should always be available: <descrip> <tag><tt>11pt, 12pt.</></tag> Set the "normal" font size to eleven, or twelve, point, instead of the default 10 point size. <tag><tt>twoside.</></tag> Formats the document for printing on both sides of a page. <tag><tt>twocolumn.</></tag> Formats the document with two columns per page, as is common in the proceedings of scientific conferences, for example. <tag><tt>titlepage.</></tag> Causes the title page and abstract to be printed on a separate page. </descrip> Other options which may be supported include: <descrip> <tag><tt>dina4.</></> Formats the document for printing on <bf/DIN A4/ size paper. (As this is the size paper used at our installation, this option is included automatically during the translation.) <tag><tt>german.</></tag> Causes the &TeX hyphenation algorithm to "think German", and sections, bibliographies and such to be labelled using the appropriate German terms. <tag><tt>times, bookman, palatino …</tt></> Causes the "main" font to be the selected PostScript font, instead of the standard &TeX font, Computer Modern, and maps all other type faces to some suitable PostScript font or type style. 
</descrip> For example, the starting tag for some article might be: <verb> <article opts="bookman,11pt"> </verb> Reports are just like articles, except that they consist of a sequence of chapters (<tt>chapt</>), instead of sections (<tt>sect</>): <code> <!element report - - (titlepag, header?, abstract?, toc?, lof?, lot?, p*, chapt*, (appendix, chapt+)?, biblio?) +(footnote)> <!attlist report opts cdata "null"> </code> Books are similar to reports, except that they may not include an abstract: <code> <!element book - - (titlepag, header?, toc?, lof?, lot?, p*, chapt*, (appendix, chapt+)?, biblio?) +(footnote) > <!attlist book opts cdata "null"> </code> The options attribute (<tt>opts</>) for <tt>report</> and <tt>book</> elements is the same as that for articles, just described, except the <tt>titlepage</> option, which is applicable only for articles. The rest of this chapter describes the common elements of articles, reports and books, starting with title pages. <sect1>Title Pages</> A title page (<tt>titlepag</>) consists of a title, a number of authors (<tt>author</>) and an optional date (<tt/date/). The title may refer to a footnote and may also include a <tt>subtitle</>. If the date element is omitted, today's date will be printed by default. To avoid having a date printed, include an empty <tt/date/ element. <code> <!element titlepag o o (title, author, date?)> <!element title - o (%inline, subtitle?) +(newline)> <!element subtitle - o (%inline)> <!usemap oneline titlepag> </code> The <tt>author</> element includes the <tt>name</> and, optionally, institution (<tt>inst</>) of the author. 
If there are multiple authors, these are separated with the <tt>and</> tag. Also, acknowledgements can be expressed using the <tt>thanks</> element. These are formatted by &LaTeX as footnotes on the title page. <code> <!element author - o (name, thanks?, inst?, (and, name, thanks?, inst?)*)> <!element name o o (%inline) +(newline)> <!element and - o empty> <!element thanks - o (%inline)> <!element inst - o (%inline) +(newline)> <!element date - o (#pcdata)> <!usemap global thanks> </code> Within the <tt>titlepag</>, the <tt>title</>, <tt>subtitle</>, <tt>author</> and <tt/inst/ elements can be broken into multiple lines using the <tt>newline</> element or, if you prefer, the <tt>nl</> entity. <code> <!element newline - o empty > <!entity nl "<newline>"> </code> The title page of this manual was typed as: <verb> <title>The <tt/qwertz/ SGML Document Types <subtitle>(Version 1.1 Reference Manual) <author>Tom Gordon <inst> Institute for Applied Information Technology (F3) &ero;nl&ero;nl German National Research Center &ero;nl for Computer Science (GMD) </verb> Notice the <tt>titlepag</> tags are optional. The simplest title page would include a title and author: <verb> <title> A Very Short Title Page <author> Snoopy </verb> <sect1>Abstracts</> Articles and reports, but not books, may have an abstract, which consists of one or more paragraphs, including the various kinds of lists, mathematical formulas and elements for literate programming: <code> <!element abstract - - (p+)> </code> <sect1>Table of Contents</> There are three elements for stating whether or not a table of contents, list of figures or list of tables should be included in the document. These tables and lists are generated by &LaTeX;. Therefore the contents of these elements is empty. They are only used to specify that the list or table should be included. 
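As a sketch, an article that should include a table of contents and a list of figures, but no list of tables, might begin as follows. The title, author and section heading are placeholders invented for this example:
<verb>
<article>
<title> A Sample Article
<author> A. N. Author
<toc>
<lof>
<sect> Introduction
...
</verb>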
<code> <!element toc - o empty> <!element lof - o empty> <!element lot - o empty> </code> <sect1>Headers</> A <tt>header</> element specifies what should be printed at the top of each page. It consists of a left heading (<tt>lhead</>) and a right heading (<tt>rhead</>). Both elements are required, if a heading is used at all, but either may be left empty, so that the effect of having only a left or right heading can be achieved easily enough. <code> <!element header - - (lhead, rhead) > <!element lhead - o (%inline)> <!element rhead - o (%inline)> </code> As we will see, an initial header can be given after the title page. Afterwards, a new header can be given for each new chapter or section. The header printed on a page is the one which is in effect at the end of the current page. Thus the header will be that of the last section starting on the page. <sect1>Sectioning</> The naming scheme we have adopted for sections is a bit different than that of &LaTeX;, because the names of SGML identifiers may be at most only eight characters long. But we think the scheme we have chosen has its advantages. In books and reports, the top-level sectional unit is the chapter</> (<tt>chapt</>). In articles, it is the section</> (<tt>sect</>). The lower sectional units are <tt>sect1</>, <tt>sect2</>, <tt>sect3</>, and <tt>sect4</>, in that order. Each section (or chapter) consists of a <tt>heading</>, followed by an optional <tt>header</>, a number of paragraphs (including such things as graphics), and then sections of the next lower level. <code> <!entity % sect "heading, header?, p* " > <!element heading o o (%inline)> <!element chapt - o (%sect, sect*) +(footnote)> <!element sect - o (%sect, sect1*) +(footnote)> <!element sect1 - o (%sect, sect2*)> <!element sect2 - o (%sect, sect3*)> <!element sect3 - o (%sect, sect4*)> <!element sect4 - o (%sect)> <!usemap oneline (chapt,sect,sect1,sect2,sect3,sect4)> </code> Don't confuse the headers with headings. 
The <tt>heading</> is just the text printed at the point where the section begins, naming the section. The <tt>header</> changes the text printed at the top of each page. If there are cross references to the section, put the <tt>label</> in the heading. For example, you could type: <verb> <sect><heading><label id=mysect>My First Section&et> </verb> If a label isn't required, you can leave the <tt>heading</> tag implicit: <verb> <sect>My First Section </verb> The <tt>appendix</> element marks the beginning of a sequence of appendices. These are chapters or sections, depending on whether the document is an article, report or book, and differ from ordinary chapters or sections only in the way they are numbered, and of course their placement at the end of the document. <code> <!element appendix - o empty > </code> <sect1>Footnotes</> The tag for footnotes is, simply enough, <tt>footnote</>.<footnote>To be sure the marker for the footnote is formatted properly, do not leave a space between the character after which the footnote marker is to appear and the beginning of the footnote element itself.</> <code> <!element footnote - - (%inline)> <!usemap global footnote> </code> Footnotes can appear anywhere within a section (or chapter). The <tt>usemap</> declaration is required to cancel the <tt>lines</> map used in title pages. <sect1>Citation</> Literature references can be made using the <tt>cite</> and <tt>ncite</> elements. The only difference between them is that the <tt>ncite</> allows a short note</> to be included in the reference, for such things as page numbers. <code> <!element cite - o empty> <!attlist cite id cdata #required> <!element ncite - o empty> <!attlist ncite id cdata #required note cdata #required> </code> For example, one might type <verb> <ncite id="Bryan88" note="pg. 68"> </verb> to refer to page 68 of Martin Bryan's book on SGML. This would appear, using &LaTeX, as <ncite id="Bryan88" note="pg. 68"> in the printed document. 
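To contrast the two elements, here is a sketch of a sentence using both. It assumes that the keys <tt>Lamport86</> and <tt>Bryan88</> are defined in the bibliography files in use; the surrounding wording is invented for illustration:
<verb>
... as described by Lamport <cite id="Lamport86">, and in more
detail by Bryan <ncite id="Bryan88" note="pg. 68"> ...
</verb>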
The <tt>id</> attribute of a <tt>cite</> or <tt>ncite</> is a reference to an identifier of a Bib&TeX bibliography file. There is a <tt>qwertz</> SGML document type for creating such bibliographies, described below. The bibliography itself, or list of references, is generated by including a <tt>biblio</> element at the end of the document, after any appendices. <code> <!element biblio - o empty> <!attlist biblio style cdata "qwertz" files cdata ""> </code> The <tt>files</> attribute of <tt>biblio</> is a list of the names of the bibliographies used, separated by commas. The names should not include any file suffixes, such as ".bib" or ".sgml". For example, to cite publications on artificial intelligence and cognitive science, where the bibliographies are maintained in two files, <tt>ai.sgml</> and <tt>cogsci.sgml</>, you would type: <verb> <biblio files="ai,cogsci"> </verb> The <tt>style</> attribute determines how the bibliography is formatted. Five styles are supported: <descrip> <tag><tt>plain</> Entries are sorted alphabetically and labeled with numbers. <tag><tt>unsrt</> The same as <tt>plain</> except the entries are ordered as they appear in the document, rather than alphabetically. <tag><tt>alpha</> The same as <tt>plain</>, except that labels are made from the author's name and the year of publication. <tag><tt>abbrv</> The same as <tt>plain</> except that first names, month names, and journal names are abbreviated. <tag><tt>qwertz</> The same as <tt>plain</> except that all words of the entry are capitalized exactly as they appear in the source file of the bibliography. The <tt>plain</> style applies capitalization rules which are inappropriate, e.g., for German titles. </descrip> <sect>Slides The <tt>slides</> element is for making a series of slides or, more commonly, overhead transparencies. 
Although you may often prefer to use some other program for preparing presentations, this approach has its advantages when you want to include parts of an existing article or book on your transparencies. You can just "cut and paste" the SGML source from an article onto a slide. You may also prefer this approach if your presentation includes mathematical formulas, to be able to take advantage of &TeX's excellent mathematics typesetting. <code> <!element slides - - (slide*) > <!attlist slides opts cdata "null"> </code> Each slide consists of an optional title, followed by one or more <tt>p</> elements: <code> <!element slide - o (title?, p+) > </code> Notice that not every element available in an article or book is also available here. In particular, there are no sectioning elements, cross references, footnotes or a bibliography.<footnote>Our translation into &TeX does not use Sli&TeX;, so as to allow slides to include tables and figures. </footnote> The <tt>title</> element will be centered on the line. You can break up the title into multiple lines with <tt>newline</> elements. The various type style elements, such as <tt>em</> and <tt>bf</>, can also be used here; indeed anywhere on a slide. <sect> Letters and Electronic Messages The <tt>letter</> element is for making letters and e-mail messages. Just how a letter is formatted may depend on whether it is a business or personal letter. If it is a business letter, it may be printed to appear as if the company's letterhead stationery had been used. The structure of a letter can be quite complex, but most of the elements to be described here are optional. Using an example from <cite id="Lamport86">, a simple letter would be typed like this: <verb> <letter> <from> R. (Ma) Dillo <address> 1234 Ave.~of the Armadillos &ero;nl Gnu York, G.Y. 56789 <to> Dr.~G. Nathaniel Picking <address> Acme Exterminators &ero;nl 33 Swat Street &ero;nl Hometown, Illinois 62301 <cc> Jimmy Carter &ero;nl Richard M. 
Nixon <opening> Dear Nat, I'm afraid that the armadillo problem is still with us. I did everything ... ... and I hope we can get rid of the nasty beasts this time. <closing> Best regards, &et;letter> </verb> The <tt>from</> and <tt>to</> elements are for the sender's and receiver's names and addresses, respectively. The address may be either a street address, using <tt/address/, or an electronic mail address, using <tt/email/, or both. You may also include a telephone number, using the <tt/phone/ element. (If you are using your company's letterhead stationery, it may be that you should type only your extension, rather than your complete telephone number.) Finally, a telefax number can be provided, using the <tt/fax/ element. Notice that in the <tt>closing</> you must type a comma yourself, if you want one. Also, do not type your name again after the closing; the <tt>name</> of the sender will be printed after the closing as expected. There are several optional elements which may be of interest: <descrip> <tag><tt>subject</> For the purpose or, well, subject of the letter. If you would like this subject line to appear as "re: &hellip", for example, you must type the "re: " yourself, as part of the subject. <tag><tt>sref, rref, rdate</> These are tags for the sender's reference</>, receiver's reference</> and receiver's date</> where you can include whatever code is used by your, or the recipient's, company or institution to uniquely identify letters. For example, if this letter is a response to some other letter, you may use the <tt>rref</> and <tt>rdate</> elements to identify the original letter. There is no <tt>sdate</> tag, as the date this letter is printed will be included in the letter at some appropriate place by the formatter. <tag><tt>cc</> This used to be an acronym for "carbon copies", which were to be sent to persons other than the principal recipient of the letter. 
The <tt>cc</> tag can be used to list these other recipients, even though the copies they receive today are perhaps printed by a laser printer on recycled paper. As in the above example, you can separate the names of these recipients with <tt>newline</> elements (using the <tt>nl</> entity if you prefer). <tag><tt>encl</> Use this tag to list enclosures</>. These can also be separated with <tt>newline</> elements, or simply with commas, if you prefer. <tag><tt>ps</> A postscript, not to be confused with PostScript, can be included with this tag. Any kind of element which can appear in the body of the letter (i.e. <tt>sectpar</> elements) can also be used here. </descrip> To summarize, here are the relevant SGML declarations: <code> <!entity % addr "(address?, email?, phone?, fax?)" > <!element letter - - (from, %addr, to, %addr, cc?, subject?, sref?, rref?, rdate?, opening, p+, closing, encl?, ps?)> <!attlist letter opts cdata "null"> <!element from - o (#pcdata) > <!element to - o (#pcdata) > <!usemap oneline (from,to)> <!element address - o (#pcdata) +(newline) > <!element email - o (#pcdata) > <!element phone - o (#pcdata) > <!element fax - o (#pcdata) > <!element subject - o (%inline;) > <!element sref - o (#pcdata) > <!element rref - o (#pcdata) > <!element rdate - o (#pcdata) > <!element opening - o (%inline;) > <!usemap oneline opening> <!element closing - o (%inline;) > <!element cc - o (%inline;) +(newline) > <!element encl - o (%inline;) +(newline) > <!element ps - o (p+) > </code> <sect> Telefax Messages The structure of a telefax message is the same as for letters and e-mail messages, except that the <tt/fax/ number of the recipient is, of course, required, rather than optional. 
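As a rough sketch, a short telefax message could be typed like this. The names and address are borrowed from the letter example above; the fax number is a made-up placeholder, not a real number:
<verb>
<telefax>
<from> R. (Ma) Dillo
<to> Dr.~G. Nathaniel Picking
<address> Acme Exterminators &ero;nl 33 Swat Street
<fax> 555 0100
<opening> Dear Nat,

I'm afraid that the armadillo problem is still with us. ...

<closing> Best regards,
&et;telefax>
</verb>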
<code> <!element telefax - - (from, %addr, to, address, email?, phone?, fax, cc?, subject?, sref?, rref?, rdate?, opening, p+, closing, ps?)> <!attlist telefax opts cdata "null" length cdata "2"> </code> <sect> Notes The <tt/notes/ element is a new top-level document "style", like articles, books and letters. It is useful for miscellaneous purposes, such as jotting down notes to oneself, where the complex structure of the other styles is unnecessary. Notes are simply a sequence of section paragraphs (i.e. paragraphs, lists, comments, long quotations, figures, tables, displayed mathematical formulas, and program code). An optional title is also available. The contents of a notes document can be copied and pasted into a section or chapter of a book or article. <code> <!element notes - - (title?, p+) > <!attlist notes opts cdata "null" > </code> <sect> Manual Pages The <tt>manpage</> element is for Unix manual pages. Here we see again an advantage of SGML. Using this element, the very same manual page can be viewed on just about every terminal, using <tt>nroff</>, or be included as a section of an article, report or book to be formatted by &TeX;. <code> <!element manpage - - (sect1*) -(sect2 | f | %mathpar | figure | tabular | table | %xref | %thrm )> <!attlist manpage opts cdata "null" title cdata "" sectnum cdata "1" > </code> A manpage consists of a sequence of sections. There are two SGML attributes, for the command name and manual section number, respectively. Each section of the manual page is delimited by a <tt>sect1</> element. <em/Notice that these sections may not contain further subsections./ Sections are represented as <tt>sect1</> elements, rather than <tt>sect</>, to allow the manual page to be easily cut and pasted into a <tt>sect</> section of an article, report or book. (Of course, if the manual page is to be used as a chapter of a book, then these sections of the manual page will need to be replaced with <tt>sect</> elements.) 
Notice that many elements, such as tables, figures and mathematical formulas, cannot be used within manual pages, because of limitations of ASCII terminals, or the Unix <tt/man/ macro package for <tt/nroff/.

There is a short reference map in effect within the scope of the <tt>manpage</>. With the exception of [, which is not used here to start formulas, this map has the <em>same effect</> as the <tt>global</> map.

<code>
<!shortref manpage "&ero;#RS;B" null
                   '"' qtag
                   "[" ftag
                   "~" nbsp
                   "_" lowbar
                   "#" num
                   "%" percnt
                   "^" circ
                   "{" lcub
                   "}" rcub
                   "|" verbar>
<!usemap manpage manpage >
</code>

<sect1> Manual Page Conventions

For detailed information about the conventions for Unix manual pages, see your Unix documentation. Here is a brief summary. The typical manual page has the following sections, in this order:

<descrip>

<tag> NAME. The name, or list of names, by which the command or function is called, followed by a dash and then a one-line summary of its purpose.

<tag> SYNOPSIS. For the syntax of the command and its arguments. (The Sun documentation suggests that literals be formatted using boldface type, and that variables be formatted using italic type. Use the <tt>tt</> and <tt>em</> elements, respectively, here for this purpose.)

<tag> DESCRIPTION. An overview of the command or function's purpose, effects and use.

<tag> OPTIONS. A list and description of all command-line options.

<tag> FILES. A list of files associated with the command which may be of interest to users.

<tag> SEE ALSO. A comma-separated list of related Unix commands, and references to other relevant publications.

<tag> DIAGNOSTICS. A list and explanation of any diagnostic messages the command may write to the standard error output file.

<tag> BUGS. A description of any known bugs, problems, or limitations.

</descrip>

Some of you may be asking yourselves why <tt>manpage</> wasn't designed so that each of these conventional sections of a manual page is represented by its own SGML element.
That certainly would have been possible, but on the other hand the approach taken has the advantage that users can simply cut and paste sections between manual pages and articles, reports and books. Of course it would have been easy to write a filter to convert between these formats, but it was felt that the benefits of a special <tt>manpage</> format would be too small to warrant even this limited effort. After all, unless they are using an SGML structure editor, users must refer to the SGML document type definition to know what is expected in the manual page. It is just as easy to check this documentation to see what sections conventionally appear in manual pages. There is also a file which can be used as a template or form for writing manual pages. See the Unix Commands chapter for details.

The only reason there is a <tt>manpage</> document type, instead of just another translation of, say, the <tt>article</> document type into <tt>nroff</>, is that the <tt>man</> macros used for the Unix documentation are not powerful enough to format all of the features available in our <tt>latex</> document type. Having this separate <tt>manpage</> document type provides a means of checking whether the manual page can be formatted by <tt>nroff</> using these <tt>man</> macros. Again, as this document type is designed to be a subset of the <tt>latex</> document type, the sections of a manual page can also be included within instances of the <tt>latex</> document type.

<sect1> Manual Page Example

Here is how the manual page for the <tt>cd</> command could have been typed using this document type definition:

<verb>
<manpage title="CD">
<sect1> NAME
cd &ero;mdash change working directory
<sect1> SYNOPSIS
cd [ <em>directory&et;> ]
<sect1> DESCRIPTION
<em>directory&et;> becomes the new working directory. The process must
have execute (search) permission in <em>directory&et;>. If cd is used
without arguments, it returns you to your login directory.
...
<sect1> SEE ALSO
csh(1), pwd(1), sh(1)
&et;manpage>
</verb>

This is the end of the <tt>qwertz</> document type definition.

<code>  </code>